Abstract

We prove tight bounds of $\Theta(k \log k)$ queries for non-adaptively testing whether a function
$f : \{0,1\}^n \to \{0,1\}$ is a $k$-parity or far from any $k$-parity. Both upper and lower bounds are
new. The lower bound combines a recent method of Blais, Brody and Matulef [BBM11] to
get testing lower bounds from communication complexity, with a new $\Theta(k \log k)$ bound for the
one-way communication complexity of $k$-disjointness.

1 Introduction 

A parity is a function $f : \{0,1\}^n \to \{0,1\}$ that can be written as $f(y) = \langle x, y \rangle$, the inner product
(mod 2) of $y$ with some fixed string $x$. We also sometimes denote this function by $f = x^*$. We call
$f$ a $k$-parity if $x$ has Hamming weight $k$. We consider the following testing problem:

Let $1 \le k \le n$ be integers. Given oracle access to a Boolean function $f : \{0,1\}^n \to \{0,1\}$,
how many queries to $f$ do we need to test (i.e., determine with probability at least 2/3)
whether $f$ is a $k$-parity or far from any $k$-parity?

Here a function $f$ is far from a set of functions $G$ if for all $g \in G$, the functions $f$ and $g$ differ
on at least a constant fraction of their domain $\{0,1\}^n$ (for concreteness one can take this constant
to be 1/10). Let $\mathrm{PAR}^n_k$ denote the set of all $k$-parities on $n$-bit inputs,
$\mathrm{PAR}^n_{\le k} = \bigcup_{\ell \le k} \mathrm{PAR}^n_\ell$, and $\mathrm{PAR}^n = \mathrm{PAR}^n_{\le n}$.

Another way of looking at the problem is as determining, by making as few queries as possible
to the Hadamard encoding of a word $x$, whether $|x| = k$ or not. So the task is essentially how to
decide if $|x| = k$ efficiently if we can query the XOR of arbitrary subsets of the bits of $x$.

It is easy to see that deciding if the size of a parity is $k$ is the same problem as deciding if it
is $n - k$. For even $n$, the case $k = n/2$ is particularly interesting because it enables us to verify
the equality between the sizes of two unknown parities $f, g \in \mathrm{PAR}^n$. Indeed, define a parity on $2n$
variables by $h(x_1 x_2) = f(1^n \oplus x_1) \oplus g(x_2)$, where $x_1, x_2 \in \{0,1\}^n$; then $h \in \mathrm{PAR}^{2n}_n$ if and only if $f$
and $g$ are parities of the same size.

A related problem is deciding if a parity has size at most $k$ (naturally, this is equivalent to
deciding if the size is at least $n - k$, or at most $n - k - 1$). Upper bounds for this task imply
upper bounds for testing $k$-parities (one can perform one test to verify the condition $|x| \le k$ and
another one for $|x| \le k - 1$). Lower bounds here do not immediately imply lower bounds for
testing isomorphism, but they (and so lower bounds for testing $k$-parities) do imply lower bounds
for testing $k$-juntas (because one way of checking if $f \in \mathrm{PAR}^n_{\le k}$ is testing that $f$ is linear and also
a $k$-junta).

The first step towards analyzing the hardness of these problems was taken by Goldreich [Gol10,
Theorem 4], who proved that testing if a linear function $f \in \mathrm{PAR}^n$ ($n$ even) is in $\mathrm{PAR}^n_{\le n/2}$
requires $\Omega(\sqrt{n})$ queries. Goldreich conjectured that the true bound should be $\Theta(n)$. Later Blais et
al. [BBM11] showed that testing if a function $f$ is a $k$-parity requires $\Omega(k)$ queries.

In this paper we focus on non-adaptive testing, where all queries to $f$ are chosen in advance.
Our main results are tight upper and lower bounds of $\Theta(k \log k)$ non-adaptive queries for testing
whether $f$ is in or far from the set $\mathrm{PAR}^n_k$. Section 2 describes our upper bound and Section 3
describes our lower bound, which is based on a new (and tight) $\Theta(k \log k)$ bound for the one-way
communication complexity of the $k$-disjointness problem. After obtaining our results, we learned
that this same communication complexity result has also independently been obtained by Dasgupta, 
Kumar and Sivakumar [DKS12]. 

2 Upper bounds 

Here we prove that $O(k \log k)$ queries suffice to non-adaptively test if $f : \{0,1\}^n \to \{0,1\}$ is a
$k$-parity. We assume $k = \omega(1)$, as the result is easy to establish if $k = O(1)$. First we show a tester
for the special case $n = 100k^2$, and then we show how the general case reduces to this special case.
The basic ingredient we need is the influence test (see also [FKR+04]).

Claim 1 (Influence test). Let $f : \{0,1\}^n \to \{0,1\}$ be a parity function with $J \subseteq [n]$ being the
set of its influential variables. There is a probabilistic procedure $I_f : \{0,1\}^n \to \{0,1\}$ that when
executed on input $x \in \{0,1\}^n$ (corresponding to a set $x \subseteq [n]$) satisfies the following:

• $I_f$ makes at most 7 queries to $f$;

• if $x \cap J = \emptyset$ then $I_f$ returns 0;

• if $x \cap J \neq \emptyset$ then $I_f$ returns 1 with probability at least 99/100.

In other words, $I_f(x)$ is a probabilistic predicate (with one-sided error) checking if $x$ and $J$ intersect.
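
For an exact parity, one simple way to obtain a test with these guarantees is to query $f$ on random subsets of $x$. The sketch below illustrates that idea only; it is not necessarily the construction of [FKR+04], and the names `make_parity` and `influence_test` are hypothetical.

```python
import random

def make_parity(J):
    """Return f(y) = XOR of the bits of y indexed by J (y is a list of 0/1)."""
    def f(y):
        return sum(y[j] for j in J) % 2
    return f

def influence_test(f, x, rng, trials=7):
    """One-sided test for 'x intersects J' using at most `trials` queries to f.

    x is a list of 0/1 marking a subset of [n]. Each trial queries f on a
    uniformly random subset w of x. For a parity f, f(w) is always 0 when
    x and J are disjoint, and is 1 with probability 1/2 otherwise.
    """
    for _ in range(trials):
        w = [bit & rng.randint(0, 1) for bit in x]
        if f(w) == 1:
            return 1
    return 0
```

With 7 trials an intersection is missed with probability $2^{-7} = 1/128 < 1/100$, which is consistent with the query count and error stated in the claim.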

The influence test can be made more robust, to handle functions $f$ that are only close to being
parities, by increasing the query complexity (per test) and switching to two-sided error:

Claim 2 (Noisy influence test). Let $f : \{0,1\}^n \to \{0,1\}$ be 1/10-close to a parity function
$g : \{0,1\}^n \to \{0,1\}$ with influential variables $J \subseteq [n]$. There is a probabilistic procedure
$\tilde{I}_f : \{0,1\}^n \to \{0,1\}$ that when executed on input $x \in \{0,1\}^n$ satisfies the following:

• $\tilde{I}_f$ makes at most 210 queries to $f$;

• if $x \cap J = \emptyset$ then $\tilde{I}_f$ returns 0 with probability at least 49/50;

• if $x \cap J \neq \emptyset$ then $\tilde{I}_f$ returns 1 with probability at least 49/50.

In other words, $\tilde{I}_f(x)$ is a probabilistic predicate checking if $x$ and $J$ (the influential variables of
the parity function $g$ closest to $f$) intersect.

Proof. We use the self-correction property of the Hadamard code:

$$\Pr_{y \in \{0,1\}^n}\left[ g(x) = f(y) \oplus f(y \oplus x) \right] \ge 1 - 2 \cdot \mathrm{dist}(f, g) \ge 4/5.$$

This allows us to correctly decode the value of $g$ on any given input with probability $1 - 1/700$
using 30 queries. Hence, by the union bound, any 7 values (for a single application of the usual
influence test) can be decoded correctly with probability at least $1 - 1/100$, using $7 \cdot 30 = 210$
queries in total. Now use the tester from Claim 1
and observe that the overall error probability is at most $1/100 + 1/100 = 1/50$. □
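
The decoding step can be sketched as a majority vote over random shifts. This is a minimal illustration: the function name `self_correct` and the default of 15 votes (30 queries) are assumptions chosen to mirror the query count in the proof, and the sketch makes no claim about the exact failure probability that this number of votes achieves.

```python
import random

def self_correct(f, x, rng, pairs=15):
    """Estimate g(x) by a majority vote over f(y) XOR f(y XOR x), y uniform.

    f is assumed to be close to a parity g. Uses 2 * pairs queries to f.
    """
    n = len(x)
    ones = 0
    for _ in range(pairs):
        y = [rng.randint(0, 1) for _ in range(n)]
        y_xor_x = [a ^ b for a, b in zip(y, x)]
        ones += f(y) ^ f(y_xor_x)
    # Each vote equals g(x) with probability >= 1 - 2 * dist(f, g).
    return 1 if 2 * ones > pairs else 0
```

Running the influence test of Claim 1 on values decoded this way, instead of on raw values of $f$, is what turns it into the noisy test.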

So, prior to testing if $f$ is a $k$-parity, we test it for being a parity function with proximity
parameter 1/10 and confidence parameter 99/100 (this can be done with a constant number of
queries [BLR90]). If this test fails then we reject; otherwise, we assume $f$ is 1/10-close to being a
parity function and condition all further probabilities accordingly.
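
A minimal sketch of a linearity check in the style of [BLR90] follows. The function name and the default number of rounds are illustrative assumptions; the constants needed for proximity 1/10 and confidence 99/100 are not derived here.

```python
import random

def blr_linearity_test(f, n, rng, rounds=50):
    """Return False as soon as a sampled pair violates f(x) + f(y) = f(x + y) (mod 2)."""
    for _ in range(rounds):
        x = [rng.randint(0, 1) for _ in range(n)]
        y = [rng.randint(0, 1) for _ in range(n)]
        s = [a ^ b for a, b in zip(x, y)]
        if f(x) ^ f(y) != f(s):
            return False
    return True
```

The queries depend only on the random coins and never on earlier answers, so the check is non-adaptive, and a true parity always passes.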

2.1 Testing in the case where $n = 100k^2$

In the following test we set $q = \frac{1000}{p} k \log k$, with $p \in (0, 1]$ being a constant defined later.

• Draw $r_1, \ldots, r_q \in \{0,1\}^n$ at random, by setting $r_{ij}$ to 1 with probability $p/k$ for each $i \in [q]$
and $j \in [n]$, independently of the others. For each $j \in [n]$, denote by $S^j \subseteq [q]$ the set of
indices $i \in [q]$ with $r_{ij} = 1$.

• Compute $a_i = \tilde{I}_f(r_i)$ for all $i \in [q]$ with the noisy influence test $\tilde{I}_f$ of Claim 2. For each $j \in [n]$
denote by $S_1^j$ the subset of $S^j$ containing indices $i$ with $a_i = 1$.

• Output the subset $\hat{J} \subseteq [n]$ containing the indices $j$ for which $|S_1^j| \ge \frac{3}{4}|S^j|$, and accept if and
only if $|\hat{J}| = k$.
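
The three steps above can be sketched as follows. This is an illustration under stated assumptions: `influence_oracle` stands in for the noisy influence test, `q` and `p` are passed as parameters rather than fixed to the values used in the analysis, and coordinates with an empty $S^j$ are skipped (a choice made here, not taken from the text).

```python
import random

def recover_influential(n, k, q, p, influence_oracle, rng):
    """Return the set of indices j whose answers are 1 on at least 3/4 of S^j."""
    # Step 1: q random vectors; each coordinate is 1 with probability p/k.
    rs = [[1 if rng.random() < p / k else 0 for _ in range(n)] for _ in range(q)]
    # Step 2: one influence-test answer a_i per vector r_i (all chosen in advance).
    answers = [influence_oracle(r) for r in rs]
    # Step 3: for each coordinate j, compare |S_1^j| with |S^j|.
    j_hat = set()
    for j in range(n):
        s_j = [i for i in range(q) if rs[i][j] == 1]
        s_j1 = [i for i in s_j if answers[i] == 1]
        if s_j and 4 * len(s_j1) >= 3 * len(s_j):  # assumption: skip empty S^j
            j_hat.add(j)
    return j_hat

def accept(n, k, q, p, influence_oracle, rng):
    """Accept iff exactly k coordinates are reported as influential."""
    return len(recover_influential(n, k, q, p, influence_oracle, rng)) == k
```

A small run with an exact oracle, for example `recover_influential(30, 3, 3000, 0.1, lambda r: int(r[2] or r[11] or r[17]), random.Random(3))`, uses toy parameters for illustration only and should recover the three hidden coordinates.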

The next claim says that with high probability, all influential variables of $f^*$ (the parity function
closest to $f$) are inserted in the output set $\hat{J}$.

Claim 3. With probability $1 - o(1)$ the following conditions are simultaneously satisfied:

• $|S^j| \ge 100 \log k$ for every $j \in [n]$;

• $|S_1^j| \ge \frac{3}{4}|S^j|$ for every $j \in J$.

Proof. Apply standard concentration bounds to prove the first item (note that the expectation of
$|S^j|$ is $1000 \log k$). Then, conditioned on it, use Claim 2 and another application of a concentration
bound to get the second item. □

The next claim says that when $|J| \le k$, with high probability none of the non-influential
variables are inserted in the output set $\hat{J}$. Before we proceed, let us call an index $i \in [q]$ intersecting w.r.t. $J$ if
$r_i \cap J \neq \emptyset$. Recall that the probability of any one element of $J$ belonging to $r_i$ is $p/k$; therefore the
probability that $i$ is not intersecting is $(1 - p/k)^{|J|} \ge (1 - p/k)^k$. We set the constant $p$ so that for
a fixed $J$ of size at most $k$, the probability that $i$ is intersecting is at most 1/10, for each $i$.
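
One admissible choice of $p$ can be checked directly with Bernoulli's inequality. The specific value $p = 1/10$ below is an illustration and is not a constant stated in the text:

```latex
\Pr[\, i \text{ is intersecting} \,]
  = 1 - (1 - p/k)^{|J|}
  \le 1 - (1 - p/k)^{k}
  \le 1 - (1 - p)
  = p ,
\qquad \text{so } p = \tfrac{1}{10} \text{ suffices.}
```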
Claim 4. If $|J| \le k$, then with probability $1 - o(1)$ the following conditions are simultaneously
satisfied:

• $|S^j| \ge 100 \log k$ for every $j \in [n]$;

• for every $j \notin J$, the fraction of non-intersecting indices in $S^j$ is at least 1/2.

Here, too, the proof follows by straightforward application of standard concentration bounds. 

To conclude the correctness of the tester, observe that by Claim 3, with high probability $J \subseteq \hat{J}$
(where $\hat{J}$ is the tester's output set), so if $J$ contains more than $k$ indices, then so will $\hat{J}$. On the other
hand, if $|J| \le k$ then by Claim 4,
with high probability all sets $S^j$ with $j \notin J$ contain a majority of non-intersecting indices. For
each non-intersecting $i \in S^j$, it holds that the answer $a_i$ is 0 with probability at least 49/50. But in order for
$j$ to belong to $\hat{J}$, at least three quarters of the indices $i \in S^j$ must have $a_i = 1$, which implies
that at least half of the non-intersecting indices $i$ of $S^j$ must have $a_i = 1$. As there are at least
$|S^j|/2 \ge 50 \log k$ non-intersecting indices in $S^j$, standard concentration estimates show that this
happens with probability at most $k^{-c}$ for some $c > 2$. Since we are in the case $n = 100k^2$, this
probability is $o(1/n)$ for each $j \in [n]$, and we can apply the union bound to conclude that the
success probability of the tester is $1 - o(1)$.

Note that the test does more than testing: it actually identifies the set $J$ of influential variables
as long as it is of size at most $k$.

2.2 Reducing the general case to $n = 100k^2$

Lemma 5. Let $k > 100$ and $n \ge 100k^2$. Given a subset $J \subseteq [n]$ and a $100k^2$-way partition
$\Pi = S_1, \ldots, S_{100k^2}$ of $[n]$, we denote by $N(\Pi, J)$ the number of classes $S_i$ containing an odd number
of elements from $J$. The following holds for randomly constructed partitions $\Pi$:

• for each $J \subseteq [n]$ of size $|J| \le k$, $\Pr_\Pi[N(\Pi, J) = |J|] \ge 9/10$,

• for each $J \subseteq [n]$ of size $|J| > k$, $\Pr_\Pi[N(\Pi, J) > k] \ge 9/10$.

Proof. Assume $n > 100k^2$ (otherwise the trivial partition with singleton and empty classes satisfies
both conditions). If $|J| \le k$, then by a birthday-paradox type argument, with probability at least 9/10
no pair of indices from $J$ belong to the same partition class, and hence $N(\Pi, J) = |J|$.

Now let $|J| > k$. Consider the stage in the construction of the random partition $\Pi$ where all but
the last $k + 1$ elements from $J$ were mapped to one of $\Pi$'s classes. If at this stage $N(\Pi, J) \ge 2k + 2$
then we are done (since adding $k + 1$ indices from $J$ can only change $N(\Pi, J)$ by $k + 1$). Otherwise,
we use a birthday-paradox type argument again to show that with probability 9/10 no pair from a
set of at most $3k + 2$ elements collides when randomly mapped to $100k^2$ classes. □

Once such a partition is obtained$^2$, we can simulate access to a function $f' : \{0,1\}^{100k^2} \to \{0,1\}$
by querying $f$ on inputs that are constant within each partition class, and reduce the original
problem of testing $f$ to the problem of testing whether $f'$ is a $k$-parity.
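
The quantity $N(\Pi, J)$ and the simulated function $f'$ can be sketched as below. The helper names are hypothetical, and a partition is represented as a list `assign` with `assign[j]` the class of index `j`, which is a choice made for this illustration.

```python
import random

def random_partition(n, m, rng):
    """Assign each index of [n] to one of m classes uniformly at random."""
    return [rng.randrange(m) for _ in range(n)]

def odd_classes(assign, J):
    """N(Pi, J): the number of classes containing an odd number of elements of J."""
    counts = {}
    for j in J:
        counts[assign[j]] = counts.get(assign[j], 0) + 1
    return sum(1 for c in counts.values() if c % 2 == 1)

def simulate(f, assign):
    """Return f'(z) = f(x), where x_j = z[assign[j]] is constant on each class."""
    def f_prime(z):
        return f([z[c] for c in assign])
    return f_prime
```

If $f$ is the parity over $J$, then `simulate(f, assign)` is the parity over exactly the classes counted by `odd_classes(assign, J)`, since elements of $J$ that share a class cancel in pairs.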

Putting everything together, we have proved our upper bound: 

Theorem 6. There exists a non-adaptive tester that uses $O(k \log k)$ queries to a given function
$f : \{0,1\}^n \to \{0,1\}$, and decides with probability at least 2/3 whether $f$ is in or far from $\mathrm{PAR}^n_k$.

2 We condition all calculations in Section 2.1 on this event, which occurs with probability 9/10.
3 Lower bounds



3.1 The one-way communication complexity of $k$-disjointness

In two-party communication complexity [Yao79, KN97], two parties (Alice and Bob) have inputs
$x$ and $y$, respectively, and want to compute some function of $x$ and $y$. Unlimited access to their
respective inputs and arbitrary computations are allowed, and the measure for the protocol's
efficiency is the number of bits of communication they need to transmit to each other. We consider
the model where Alice and Bob share a common source of randomness ("public coin") and are
allowed to err with probability at most 1/3.

In the $k$-disjointness problem, Alice and Bob receive two $k$-sets $x, y \in \binom{[n]}{k}$ and would like to
determine if $x \cap y = \emptyset$ or not. Furthermore, they are guaranteed that either $x \cap y = \emptyset$ or $|x \cap y| = 1$.
This problem is known to have communication complexity $\Theta(k)$. The upper bound is due to Håstad
and Wigderson [HW07], the lower bound due to Kalyanasundaram and Schnitger, and subsequent
simplifications and strengthenings were found by Razborov [Raz92] and Bar-Yossef et al. [BJKS04].
The Håstad-Wigderson protocol is interactive (i.e., it uses many rounds of communication), and
we show here this is actually necessary: if we just allow one-way communication from Alice to Bob,
then the lower bound goes up from $\Omega(k)$ to $\Omega(k \log k)$ bits.

Theorem 7. The one-way communication complexity of the $k$-disjointness problem is $\Theta(k \log k)$
for $k \le \sqrt{n/2}$, and $\Theta(\log \binom{n}{k})$ for $k > \sqrt{n/2}$.

Proof. For the upper bound, first note that Alice can just send Bob the index of her input $x$ in
the set of all weight-$k$ strings of length $n$, at the expense of $\log \binom{n}{k}$ bits. If $k \le \sqrt{n/2}$ then we can
do something better, as follows. Alice and Bob use the shared randomness to choose a random
partition of their inputs into $b = O(k^2)$ buckets, each of size $n/b$. By similar birthday paradox
arguments as before, with probability close to 1 no two 1-positions in $x$ will end up in the same
bucket, and no two 1-positions in $y$ will end up in the same bucket. We condition the remainder
of the upper-bound argument on this successful bucketing. Note that $x$ and $y$ intersect iff there
is an $i \in [b]$ such that Alice and Bob's strings in the $i$th bucket are equal and non-zero. For each
of her $k$ non-empty buckets, Alice sends Bob the index of that bucket, and uses the well-known
public-randomness equality protocol on that bucket: they choose $2 \log k$ uniformly random strings
$r_1, \ldots, r_{2 \log k} \in \{0,1\}^{n/b}$ and Alice sends over the inner products (mod 2) of her bucket with each
of those strings. Bob compares the bits he received with the inner products of $r_1, \ldots, r_{2 \log k}$ with
his corresponding bucket. If their two buckets are the same then all inner products will be the
same, and if their two buckets differ in at least one bit-position then they will see a difference in
those inner products, except with probability $1/2^{2 \log k} = 1/k^2$. Bob checks whether one of Alice's
non-empty buckets equals his corresponding bucket. If so he concludes that $x$ and $y$ intersect, and
otherwise he concludes that they are disjoint. Taking the union bound over the probability that
the bucketing fails and the probability that one of the $k$ equality tests fails shows that the error
probability is close to 0. The communication cost of this one-way protocol is $O(\log k)$ bits for each
of Alice's non-empty buckets, so $O(k \log k)$ bits in total.
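
The one-way protocol can be sketched as follows. This is an illustration under assumptions: all function names are hypothetical, sets are given as Python sets of positions, the bucketing is passed in as a list `bucket_of`, and the shared strings are drawn with length $n$ and restricted to each bucket, which plays the same role as drawing separate strings per bucket.

```python
import random

def public_coin(n, t, seed):
    """Shared randomness: t uniformly random 0/1 strings of length n."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n)] for _ in range(t)]

def fingerprints(S, bucket_of, strings):
    """Map each non-empty bucket of S to its inner products (mod 2) with the
    restriction of every shared string to that bucket."""
    out = {}
    for j in S:
        bits = out.setdefault(bucket_of[j], [0] * len(strings))
        for t, r in enumerate(strings):
            bits[t] ^= r[j]
    return out

def alice_message(x, bucket_of, strings):
    """One (bucket index, fingerprint) pair per non-empty bucket of x."""
    return sorted(fingerprints(x, bucket_of, strings).items())

def bob_decides(message, y, bucket_of, strings):
    """Return True ('intersect') iff some bucket of Alice matches Bob's bucket."""
    mine = fingerprints(y, bucket_of, strings)
    # An empty bucket on Bob's side can never equal a non-empty one of Alice.
    return any(c in mine and mine[c] == bits for c, bits in message)
```

Alice's message consists of at most $k$ pairs, each a bucket index plus one bit per shared string, and Bob never replies, so the communication is one-way.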

For the lower bound, first consider the case $k \le \sqrt{n/2}$. Let $x$ be Alice's input, viewed as an $n$-bit
string of Hamming weight $k$. For Alice we restrict our attention to inputs of a particular structure.
Namely, partition $[n]$ into $k$ consecutive sets of size $n/k \ge 2k$. The inputs we allow contain precisely
one bit set to 1 inside each block of the partition, and moreover the offset of the unique index set
to one within the $i$th block is an integer in $\{0, \ldots, 2k - 1\}$. In this case, $x$ describes a message $M$
of $k$ integers $m_1, \ldots, m_k$, each in the interval $\{0, \ldots, 2k - 1\}$. $M$ can also be viewed as an $m$-bit
long message, where $m = k \log(2k)$. We can write Alice's input as $x = u(m_1) \ldots u(m_k)$, where
$u(m_i) \in \{0,1\}^{n/k}$ is the unary expression of the number $m_i$ using $n/k$ bits (where the rightmost
$n/k - 2k$ bits of each $u(m_i)$ are always zero). For instance, the picture below illustrates the case
where $n = 40$, $k = 4$, and $M = (1, 7, 0, 5)$:

       n/k         n/k         n/k         n/k
x = 0100000000  0000000100  1000000000  0000010000
      u(m_1)      u(m_2)      u(m_3)      u(m_4)

Let $\rho_x$ be the $q$-bit message that Alice sends on this input; this is a random variable, depending
on the public coin. Below we show that the message is a random-access code for $M$, i.e., it allows
a user to recover each bit of $M$ with probability at least $1 - \delta$ (though not necessarily all bits
of $M$ simultaneously). Then our lower bound will follow from Nayak's random-access code lower
bound [Nay99]. This says that

$$q \ge (1 - H(\delta))\, m,$$

where $\delta$ is the error probability of the protocol and $H(\delta) = -\delta \log \delta - (1 - \delta) \log(1 - \delta)$ is its binary
entropy.
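
For concreteness, plugging in the error probability $\delta = 1/3$ allowed by the model, and assuming logarithms to base 2, gives the following numbers:

```latex
H(1/3) = \tfrac{1}{3}\log_2 3 + \tfrac{2}{3}\log_2 \tfrac{3}{2}
       = \log_2 3 - \tfrac{2}{3} \approx 0.918,
\qquad
q \ge \bigl(1 - H(1/3)\bigr)\, m \approx 0.082 \cdot k \log(2k) = \Omega(k \log k).
```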

Suppose Bob is given $\rho_x$ and wants to recover some bit of $M$. Say this bit is the $\ell$th bit of the
binary expansion of $m_i$. Then Bob completes the protocol using the following $y$: $y$ is 0 everywhere
except on the $k$ bits in the $i$th block of size $n/k$ whose offsets $j$ (measured from the start of the
block) satisfy the following: $0 \le j < 2k$ and the $\ell$th bit of the binary expansion of $j$ is 1. The
Hamming weight of $y$ is $k$ by definition.

Recall that Alice has a 1 in block $i$ only at position $m_i$. Hence $x$ and $y$ will intersect iff the $\ell$th
bit of the binary expansion of $m_i$ is 1, and moreover, the size of the intersection is either 0 or 1.
Running the $k$-disjointness protocol with confidence $1 - \delta$ will now give Bob the sought-for bit of
$M$ with probability at least $1 - \delta$, which shows that $\rho_x$ is a random-access code for $M$.
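
The encoding and Bob's query can be checked on the example above. In this sketch the function names are hypothetical, blocks are 0-indexed by `i`, the bit position `ell` is counted from the least significant bit, and $2k$ is assumed to be a power of two; all of these are conventions chosen for the illustration.

```python
def alice_input(M, n, k):
    """x = u(m_1) ... u(m_k): block i carries a single 1 at offset m_i."""
    block = n // k
    x = [0] * n
    for i, m_i in enumerate(M):
        x[i * block + m_i] = 1
    return x

def bob_input(i, ell, n, k):
    """y has ones in block i at the offsets j < 2k whose bit ell is 1."""
    block = n // k
    y = [0] * n
    for j in range(2 * k):
        if (j >> ell) & 1:
            y[i * block + j] = 1
    return y

def intersection_size(x, y):
    """Number of positions where both strings have a 1."""
    return sum(a & b for a, b in zip(x, y))
```

For $M = (1, 7, 0, 5)$ with $n = 40$ and $k = 4$, `intersection_size(alice_input(M, 40, 4), bob_input(i, ell, 40, 4))` equals bit `ell` of `M[i]` for every block `i` and every `ell` in `range(3)`.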

If $k > \sqrt{n/2}$ then we can do basically the same lower-bound proof, except that the integers $m_i$
are now in the interval $\{0, \ldots, n/k - 1\}$, $m = k \log(n/k)$, and Bob puts only $n/2k < k$ ones in the
$i$th block of $y$ (he can put his remaining $k - n/2k$ indices somewhere at the end of the block, at an
agreed place where Alice won't put 1s). This gives a lower bound of $\Omega(k \log(n/k)) = \Omega(\log \binom{n}{k})$. □

We note that the lower bound holds even for quantum one-way communication complexity, and
even if we allow Alice and Bob to share entanglement. For the latter case Nayak's random-access
code lower bound [Nay99] needs to be replaced with Klauck's [Kla00] version, which is weaker by
a factor of two.

3.2 Non-adaptive lower bound for testing $k$-parities

In a recent paper, Blais, Brody and Matulef [BBM11] made a clever connection between property
testing and some well-studied problems in communication complexity. As one of the applications
of this connection, they used the $\Omega(k)$ lower bound for $k$-disjointness to prove an $\Omega(k)$ lower bound
on testing whether a function is in or far from the class of $k$-parities. We use their argument to get
a better lower bound for non-adaptive testers:
Corollary 8. Let $1 \le k \le n$. If $k \le \sqrt{n/2}$ then non-adaptive testers need at least $\Omega(k \log k)$
queries to test with success probability at least 2/3 whether a given function $f : \{0,1\}^n \to \{0,1\}$ is
in or far from $\mathrm{PAR}^n_k$; and if $k > \sqrt{n/2}$ then they need at least $\Omega(\log \binom{n}{k})$ queries.

Proof. Let $k$ be even (a similar argument works for odd $k$). Below we show how Alice and Bob
can use a non-adaptive $q$-query tester for $k$-parities to get a one-way public-coin communication
protocol for $k/2$-disjointness with $q$ bits of communication. The communication lower bound of
Theorem 7 then implies the result.

Alice forms the function $f = x^*$ and Bob forms the function $g = y^*$. Consider the function
$h = (x \oplus y)^*$. Since $|x \oplus y| = |x| + |y| - 2|x \cap y|$, the function $h$ is a $k$-parity if $x \cap y = \emptyset$, and a
$(k - 2)$-parity if $|x \cap y| = 1$. A $q$-query randomized tester is a probability distribution over $q$-query
deterministic testers. Alice and Bob use the public coin to jointly sample one of those deterministic
testers. Since the tester is non-adaptive, this fixes the $q$ queries that will be made. For every such
query $z \in \{0,1\}^n$, Alice sends Bob the bit $f(z)$. This enables Bob to compute $h(z) = f(z) \oplus g(z)$ for
all $q$ queries and then to finish the computation of the tester. Since a $(k - 2)$-parity has distance 1/2
from every $k$-parity, the tester will tell Bob whether $h$ is a $k$-parity or a $(k - 2)$-parity; equivalently,
whether $x$ and $y$ intersect or not. □
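
The message structure of this reduction can be sketched as follows. The function names are hypothetical, sets are represented as Python sets of coordinates, and `queries` stands for the query list fixed by the public coin.

```python
def parity(s):
    """s* : z -> <s, z> mod 2, for a set s of coordinates."""
    return lambda z: sum(z[j] for j in s) % 2

def alice_bits(x, queries):
    """Alice's whole message: one bit f(z) per query of the non-adaptive tester."""
    f = parity(x)
    return [f(z) for z in queries]

def bob_answers(message, y, queries):
    """Bob recovers h(z) = f(z) XOR g(z) for every query z."""
    g = parity(y)
    return [bit ^ g(z) for bit, z in zip(message, queries)]
```

Because the queries are fixed before any answer is seen, Alice can send all $q$ bits at once. An adaptive tester would need Bob's answers to choose later queries, which is why this argument is restricted to non-adaptive testers.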

As mentioned in the introduction, a lower bound for testing membership in $\mathrm{PAR}^n_k$ implies a
lower bound for $\mathrm{PAR}^n_{\le k}$ and juntas.

4 Conclusion and future work 

We end with a few comments and directions for future research: 

• While our disjointness lower bound (Theorem 7) also applies to one-way quantum protocols,
our lower bound for testing (Corollary 8) does not. The reason is that the overhead when
turning a quantum tester into a communication protocol will be $O(n)$ qubits per query in the
quantum case, in contrast to the $O(1)$ bits per query in the classical case. In fact, if $f$ is a
$k$-parity then the Bernstein-Vazirani algorithm [BV97] finds $x$ itself using only one quantum
query, so testing for $k$-parities is trivial for quantum algorithms.

• For adaptive testers for $k$-parities there is still a gap between the best lower bound of $\Omega(k)$
queries and the best upper bound of $O(k \log k)$ queries. It would be interesting to close this
gap.